AITopics

Country:

North America > United States (0.67)
Europe > Germany (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Feb-15-2026, 11:55:10 GMT

7e810b2c75d69be186cadd2fe3febeab-Paper-Conference.pdf

discovery, machine learning, natural language, (19 more...)

Country:

Europe (1.00)
Asia (0.67)
North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.47)

Industry:

Transportation > Air (1.00)
Media > News (1.00)
Law (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Neural Information Processing Systems, Feb-11-2026, 06:16:07 GMT

46b065f7d301a15a23909f6cad409a97-Paper-Conference.pdf

artificial intelligence, machine learning, natural language, (19 more...)

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial Intelligence, Nov-13-2025

Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework

Kopf, Laura, Feldhus, Nils, Bykov, Kirill, Bommer, Philine Lou, Hedström, Anna, Höhne, Marina M.-C., Eberle, Oliver

Automated interpretability research aims to identify concepts encoded in neural network features to enhance human understanding of model behavior. Within the context of large language models (LLMs) for natural language processing (NLP), current automated neuron-level feature description methods face two key challenges: limited robustness and the assumption that each neuron encodes a single concept (monosemanticity), despite increasing evidence of polysemanticity. This assumption restricts the expressiveness of feature descriptions and limits their ability to capture the full range of behaviors encoded in model internals. To address this, we introduce Polysemantic FeatuRe Identification and Scoring Method (PRISM), a novel framework specifically designed to capture the complexity of features in LLMs. Unlike approaches that assign a single description per neuron, common in many automated interpretability methods in NLP, PRISM produces more nuanced descriptions that account for both monosemantic and polysemantic behavior. We apply PRISM to LLMs and, through extensive benchmarking against existing methods, demonstrate that our approach produces more accurate and faithful feature descriptions, improving both overall description quality (via a description score) and the ability to capture distinct concepts when polysemanticity is present (via a polysemanticity score).

large language model, machine learning, natural language, (20 more...)

2506.15538

Country:

North America (0.93)
Europe > Germany (0.68)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hassan, Syed Zohaib, Halvorsen, Pål, Johnson, Miriam S., Lison, Pierre

Evaluating LLMs on Generating Age-Appropriate Child-Like Conversations

arXiv.org Artificial Intelligence, Oct-29-2025

Large Language Models (LLMs), predominantly trained on adult conversational data, face significant challenges when generating authentic, child-like dialogue for specialized applications. We present a comparative study evaluating five different LLMs (GPT-4, RUTER-LLAMA-2-13b, GPTSW, NorMistral-7b, and NorBloom-7b) to generate age-appropriate Norwegian conversations for children aged 5 and 9 years. Through a blind evaluation by eleven education professionals using both real child interview data and LLM-generated text samples, we assessed authenticity and developmental appropriateness. Our results show that evaluators achieved strong inter-rater reliability (ICC=0.75) and demonstrated higher accuracy in age prediction for younger children (5-year-olds) compared to older children (9-year-olds). While GPT-4 and NorBloom-7b performed relatively well, most models generated language perceived as more linguistically advanced than the target age groups. These findings highlight critical data-related challenges in developing LLM systems for specialized applications involving children, particularly in low-resource languages where comprehensive age-appropriate lexical resources are scarce.

age group, large language model, machine learning, (20 more...)

2510.2425

Country:

North America > United States (0.46)
Europe > Norway (0.30)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Oct-8-2025, 23:33:38 GMT

7e810b2c75d69be186cadd2fe3febeab-Paper-Conference.pdf

artificial intelligence, machine learning, natural language, (19 more...)

Country:

North America > Canada (0.14)
Africa > Middle East > Egypt (0.14)
Africa > Nigeria (0.14)
(15 more...)

Genre: Research Report > Experimental Study (0.47)

Industry:

Transportation > Air (1.00)
Media > News (1.00)
Law (1.00)
(6 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Neural Information Processing Systems, Oct-8-2025, 14:40:09 GMT

46b065f7d301a15a23909f6cad409a97-Paper-Conference.pdf

dr loss, robust accuracy, robustness, (16 more...)

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Tian, Claire, Tian, Katherine, Hu, Nathan

Measuring Sparse Autoencoder Feature Sensitivity

arXiv.org Artificial Intelligence, Sep-30-2025

Sparse Autoencoder (SAE) features have become essential tools for mechanistic interpretability research. SAE features are typically characterized by examining their activating examples, which are often "monosemantic" and align with human interpretable concepts. However, these examples don't reveal feature sensitivity: how reliably a feature activates on texts similar to its activating examples. In this work, we develop a scalable method to evaluate feature sensitivity. Our approach avoids the need to generate natural language descriptions for features; instead we use language models to generate text with the same semantic properties as a feature's activating examples. We then test whether the feature activates on these generated texts. We demonstrate that sensitivity measures a new facet of feature quality and find that many interpretable features have poor sensitivity. Human evaluation confirms that when features fail to activate on our generated text, that text genuinely resembles the original activating examples. Lastly, we study feature sensitivity at the SAE level and observe that average feature sensitivity declines with increasing SAE width across 7 SAE variants. Our work establishes feature sensitivity as a new dimension for evaluating both individual features and SAE architectures.
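The sensitivity notion in the abstract above (how reliably a feature activates on generated texts that resemble its activating examples) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `feature_sensitivity`, the activation threshold, and the idea of passing in precomputed per-text activations are all assumptions made for illustration.

```python
def feature_sensitivity(activations, threshold=0.0):
    """Fraction of generated texts on which a feature fires.

    activations: one number per generated text, e.g. the feature's
        maximum activation over that text (hypothetical input format).
    threshold: activation level above which the feature counts as
        "active" (an assumed convention, not taken from the paper).
    """
    if not activations:
        raise ValueError("need at least one generated text")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical activations of one feature on four generated texts.
score = feature_sensitivity([0.0, 1.2, 0.0, 3.4])
```

A score near 1 would mean the feature fires on almost every text that shares the semantic properties of its activating examples; a low score corresponds to the poor sensitivity the abstract reports for many otherwise interpretable features.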

large language model, machine learning, natural language, (19 more...)

2509.23717

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

arXiv.org Artificial Intelligence, Apr-30-2025

Combatting Dimensional Collapse in LLM Pre-Training Data via Diversified File Selection

Fan, Ziqing, Du, Siyuan, Hu, Shengchao, Wang, Pingjie, Shen, Li, Zhang, Ya, Tao, Dacheng, Wang, Yanfeng

Selecting high-quality pre-training data for large language models (LLMs) is crucial for enhancing their overall performance under a limited computation budget, improving both training and sample efficiency. Recent advancements in file selection primarily rely on using an existing or trained proxy model to assess the similarity of samples to a target domain, such as the high-quality sources BookCorpus and Wikipedia. However, upon revisiting these methods, the domain-similarity selection criteria exhibit a diversity dilemma, i.e., dimensional collapse in the feature space, improving performance on domain-related tasks but causing severe degradation in generic performance. To prevent collapse and enhance diversity, we propose a DiverSified File selection algorithm (DiSF), which selects the most decorrelated text files in the feature space. We approach this with a classical greedy algorithm to achieve more uniform eigenvalues in the feature covariance matrix of the selected texts, analyzing its approximation to the optimal solution under a formulation as a $γ$-weakly submodular optimization problem. Empirically, we establish a benchmark and conduct extensive experiments on the TinyLlama architecture with models from 120M to 1.1B parameters. Evaluating across nine tasks from the Harness framework, DiSF demonstrates a significant improvement in overall performance. Specifically, DiSF saves 98.5% of 590M training files in SlimPajama, outperforming the full-data pre-training within a 50B training budget, and achieving about 1.5x training efficiency and 5x data efficiency.
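The greedy, decorrelation-driven selection described in the abstract above can be sketched as follows. This is an illustrative toy, not the DiSF algorithm itself: it assumes each file already has a feature vector, uses the uncentered second-moment matrix of unit-normalized features in place of the covariance matrix, and scores candidates by its Frobenius norm. With unit-norm features the trace of that matrix is fixed, so a smaller Frobenius norm means more uniform eigenvalues. The squared norm of a sum of outer products equals the sum of squared pairwise dot products, so the greedy step reduces to picking the candidate with the smallest total squared similarity to the files already chosen. The function name and the choice to seed with the first file are assumptions.

```python
import math


def _unit(v):
    # Scale a vector to unit length (assumes it is not all zeros).
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def greedy_decorrelated_selection(features, budget):
    """Greedily pick `budget` file indices with decorrelated features."""
    x = [_unit(v) for v in features]
    selected = [0]  # arbitrary seed: start from the first file
    # overlap[c] = sum over selected i of (x_i . x_c)^2
    overlap = [_dot(x[0], xc) ** 2 for xc in x]
    while len(selected) < min(budget, len(x)):
        candidates = [c for c in range(len(x)) if c not in selected]
        # Adding the candidate with the smallest overlap gives the
        # smallest Frobenius norm of the updated second-moment matrix.
        best = min(candidates, key=lambda c: overlap[c])
        selected.append(best)
        for c in range(len(x)):
            overlap[c] += _dot(x[best], x[c]) ** 2
    return selected


# Three near-duplicate directions and one orthogonal direction.
toy_features = [[1, 0], [1, 0], [0, 1], [1, 0]]
picked = greedy_decorrelated_selection(toy_features, budget=2)
```

On the toy input the second pick is the file whose feature is orthogonal to the seed rather than one of its duplicates, which is the behavior a diversity-seeking criterion is meant to produce.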

large language model, machine learning, natural language, (19 more...)

2504.20644

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial Intelligence, Dec-15-2024

AD-LLM: Benchmarking Large Language Models for Anomaly Detection

Yang, Tiankai, Nian, Yi, Li, Shawn, Xu, Ruiyao, Li, Yuangang, Li, Jiaqi, Xiao, Zhuo, Hu, Xiyang, Rossi, Ryan, Ding, Kaize, Hu, Xia, Zhao, Yue

Anomaly detection (AD) is an important machine learning task with many real-world uses, including fraud detection, medical diagnosis, and industrial monitoring. Within natural language processing (NLP), AD helps detect issues like spam, misinformation, and unusual user activity. Although large language models (LLMs) have had a strong impact on tasks such as text generation and summarization, their potential in AD has not been studied enough. This paper introduces AD-LLM, the first benchmark that evaluates how LLMs can help with NLP anomaly detection. We examine three key tasks: (i) zero-shot detection, using LLMs' pre-trained knowledge to perform AD without task-specific training; (ii) data augmentation, generating synthetic data and category descriptions to improve AD models; and (iii) model selection, using LLMs to suggest unsupervised AD models. Through experiments with different datasets, we find that LLMs can work well in zero-shot AD, that carefully designed augmentation methods are useful, and that explaining model selection for specific datasets remains challenging. Based on these results, we outline six future research directions on LLMs for AD.

category, large language model, machine learning, (19 more...)